positive-confidence data
Sparse classification with positive-confidence data in high dimensions
Mai, The Tien, Nguyen, Mai Anh, Nguyen, Trung Nghia
High-dimensional learning problems, where the number of features exceeds the sample size, often require sparse regularization for effective prediction and variable selection. While well established for fully supervised data, these techniques remain underexplored in weak-supervision settings such as Positive-Confidence (Pconf) classification. Pconf learning uses only positive samples equipped with confidence scores, thereby avoiding the need for negative data. However, existing Pconf methods are ill-suited to high-dimensional regimes. This paper proposes a novel sparse-penalization framework for high-dimensional Pconf classification. We introduce estimators using convex (Lasso) and non-convex (SCAD, MCP) penalties to address shrinkage bias and improve feature recovery. Theoretically, we establish estimation and prediction error bounds for the L1-regularized Pconf estimator, proving that it achieves near minimax-optimal sparse recovery rates under a Restricted Strong Convexity condition. To solve the resulting composite objective, we develop an efficient proximal gradient algorithm. Extensive simulations demonstrate that our proposed methods achieve predictive performance and variable selection accuracy comparable to fully supervised approaches, effectively bridging the gap between weak supervision and high-dimensional statistics.
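The abstract does not spell out the algorithm, so the following is only a minimal sketch of how a proximal gradient (ISTA-style) solver for an L1-penalized Pconf objective could look. It assumes a linear score x @ w, the logistic loss, the standard Pconf risk weighting (1 - r) / r on the "negative" side, and a fixed step size 1/L derived from a curvature bound. The function names (`pconf_lasso_ista`, `soft_threshold`) and all parameter choices are hypothetical and are not taken from the paper.

```python
import numpy as np

def sigmoid(z):
    # Numerically stable logistic function.
    return 0.5 * (1.0 + np.tanh(0.5 * z))

def soft_threshold(v, t):
    # Proximal operator of t * ||.||_1 (elementwise shrinkage toward zero).
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def pconf_lasso_ista(X, r, lam, n_iter=500):
    """Sketch: minimize mean_i[l(g_i) + c_i * l(-g_i)] + lam * ||w||_1.

    X   : (n, d) array of positive samples only
    r   : (n,) confidences p(y = +1 | x_i), assumed to lie in (0, 1]
    lam : L1 penalty level (assumed tuning parameter)
    Here l is the logistic loss, g_i = x_i @ w and c_i = (1 - r_i) / r_i.
    """
    n, d = X.shape
    c = (1.0 - r) / r
    # Curvature bound: each sample contributes at most (1 + c_i) / 4 = 1 / (4 r_i).
    L = np.max(1.0 / r) * np.linalg.norm(X, 2) ** 2 / (4.0 * n)
    step = 1.0 / L
    w = np.zeros(d)
    for _ in range(n_iter):
        g = X @ w
        # Gradient of the smooth part with respect to w.
        grad = X.T @ (-sigmoid(-g) + c * sigmoid(g)) / n
        # Gradient step followed by the L1 proximal step.
        w = soft_threshold(w - step * grad, step * lam)
    return w
```

Replacing `soft_threshold` with the proximal map of another penalty would be the natural place to experiment with SCAD or MCP, although the convergence argument for the fixed step above applies only to the convex L1 case.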
Binary Classification from Positive-Confidence Data
Can we learn a binary classifier from only positive data, without any negative data or unlabeled data? We show that if one can equip positive data with confidence (positive-confidence), one can successfully learn a binary classifier, which we name positive-confidence (Pconf) classification. Our work is related to one-class classification, which aims at "describing" the positive class by clustering-related methods, but one-class classification does not have the ability to tune hyper-parameters, and its aim is not "discriminating" between positive and negative classes. For the Pconf classification problem, we provide a simple empirical risk minimization framework that is model-independent and optimization-independent. We theoretically establish the consistency and an estimation error bound, and demonstrate the usefulness of the proposed method for training deep neural networks through experiments.
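As a rough illustration of the empirical risk minimization idea, the sketch below evaluates a Pconf-style empirical risk on positive samples only. It assumes the usual rewriting of the classification risk in which each positive sample x_i with confidence r_i = p(y = +1 | x_i) contributes l(g(x_i)) + (1 - r_i) / r_i * l(-g(x_i)), with the constant class-prior factor dropped. The linear model, the logistic loss, and the name `pconf_empirical_risk` are assumptions made for this example; the framework itself is described as model-independent.

```python
import numpy as np

def logistic_loss(z):
    # Numerically stable log(1 + exp(-z)).
    return np.logaddexp(0.0, -z)

def pconf_empirical_risk(w, X, r):
    """Pconf-style empirical risk for a linear score g(x) = x @ w.

    X : (n, d) array of positive samples only
    r : (n,) confidences p(y = +1 | x_i), assumed to lie in (0, 1]
    The class-prior factor is omitted because it does not change the minimizer.
    """
    g = X @ w
    # The second term stands in for the unseen negative class.
    return np.mean(logistic_loss(g) + (1.0 - r) / r * logistic_loss(-g))

# Hypothetical usage on synthetic data.
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(5, 3))
r_demo = np.full(5, 0.8)
risk_at_zero = pconf_empirical_risk(np.zeros(3), X_demo, r_demo)
```

At w = 0 every score is zero, so the value is about 0.866 (that is, 1.25 · log 2) for confidences of 0.8. Samples with confidence near 1 contribute almost nothing through the second term, while low-confidence samples are weighted heavily toward the negative side.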
Reviews: Binary Classification from Positive-Confidence Data
Overview and Recommendation: This paper presents an algorithm and theoretical analysis for Pconf learning of a binary classifier when only positive instances and their conditional class probabilities are available. In contrast to PU learning, which has recently received attention in the ML community, Pconf learning does not require a large amount of unlabeled data or any knowledge of class priors; however, it does require a confidence weight p(y = 1 | x) for each example x sampled from p(x | y = 1). The paper is well written, and the solution can be useful for many real-world applications; therefore, it is a clear accept. Although the analysis is very similar to the analysis for PU learning recently published by Du Plessis et al. at NIPS, the problem setting is novel and interesting. However, I think the authors should fix some technical issues with their simulation experiment.
Binary Classification from Positive-Confidence Data
Ishida, Takashi, Niu, Gang, Sugiyama, Masashi